Abstract

Background: Saprophytic filamentous fungi are ubiquitous micro-organisms that play an essential role in photosynthetic 
carbon recycling. The wood-decayer Pycnoporus cinnabarinus is a model fungus for the study of plant cell wall 
decomposition and is used for a number of applications in green and white biotechnology. 

Results: The 33.6 megabase genome of P. cinnabarinus was sequenced and assembled, and the 10,442 
predicted genes were functionally annotated using a phylogenomic procedure. In-depth analyses were carried 
out for the numerous enzyme families involved in lignocellulosic biomass breakdown, for protein secretion and 
glycosylation pathways, and for mating type. The P. cinnabarinus genome sequence revealed a consistent repertoire 
of genes shared with wood-decaying basidiomycetes. P. cinnabarinus is thus fully equipped with the classical families 
involved in cellulose and hemicellulose degradation, whereas its pectinolytic repertoire appears relatively limited. In 
addition, P. cinnabarinus possesses a complete versatile enzymatic arsenal for lignin breakdown. We identified several 
genes encoding members of the three ligninolytic peroxidase types, namely lignin peroxidase, manganese peroxidase 
and versatile peroxidase. Comparative genome analyses were performed in fungi displaying different nutritional 
strategies (white-rot and brown-rot modes of decay). P. cinnabarinus presents a typical distribution of all the 
specific families found in the white-rot life style. Growth profiling of P. cinnabarinus was performed on 35 carbon 
sources including simple and complex substrates to study substrate utilization and preferences. P. cinnabarinus 
grew faster on crude plant substrates than on pure, mono- or polysaccharide substrates. Finally, proteomic analyses 
were conducted from liquid and solid-state fermentation to analyze the composition of the secretomes corresponding 
to growth on different substrates. The distribution of lignocellulolytic enzymes in the secretomes was strongly 
dependent on growth conditions, especially for lytic polysaccharide mono-oxygenases. 

Conclusions: With its available genome sequence, P. cinnabarinus is now an outstanding model system for the study
of the enzyme machinery involved in the degradation or transformation of lignocellulosic biomass.

Keywords: Pycnoporus cinnabarinus, Genome annotation, CAZy, Auxiliary activities, Oxidoreductase, White-rot fungi, 
Lignocellulose 



Background 

Filamentous fungi are a source of powerful enzymes for 
plant biomass breakdown and/or hydrolysis in green and 
white biotechnology, especially biorefining [1]. The enzym- 
atic modification of lignin-derived aromatic compounds is 
of strategic importance both for biomass valorization of 
the other plant-cell-wall compounds in the green chemistry 
sector and for the biotransformation of these aromatic 
compounds into high-value products (foods, cosmetics and 
pharmaceuticals) or industrial compounds (surfactants, ad- 
hesives and biomaterials). 

The proportions of the constituent polymers of plant 
cell walls, i.e. cellulose, hemicelluloses, pectin and lignins, 
fluctuate with botanical origin, tissue, and age of the
plant. In response to the structural complexity and hetero- 
geneity of the different plant cell wall polymers, sapro- 
phytic fungi produce a complex arsenal of enzymes to 
gain access to the carbon source. Lignocellulolytic fungi 
have traditionally been classified into three main fungal 
groups according to the appearance of the plant material 
remaining after decomposition [2]. Soft-rot fungi partially
degrade plant polysaccharides by mobilizing cellulases 
and hemicellulases, and cause wood softening [3]. In 
contrast, brown-rot fungi such as Postia placenta pro- 
duce enzymes involved in extracellular generation of 
Fenton's reagent, where hydroxyl radicals resulting
from the reaction between Fe(II) and hydrogen perox- 
ide may ultimately cause cellulose depolymerization 
[4]. Lignin is apparently only slightly modified in this 
process, and remains as a crumbly, brownish material. 
Unlike the above two groups, white-rot fungi are the
only organisms able to effectively degrade lignin, in a 
process called enzymatic combustion [5] where peroxi- 
dases cooperate with other oxidoreductases [6]. The 
decayed wood resulting from attack by white-rot fungi 
becomes white and stringy. For selective white-rot fungi, 
the white color is caused by rapid hemicellulose and lignin 
breakdown of the cell-wall constituents, followed later by 
cellulose degradation [7]. 

The white-rot fungus Pycnoporus is very efficient at 
completely degrading lignin [8]. The Pycnoporus genus 
belongs to the phylum Basidiomycota, class Agarico- 
mycetes, order Polyporales, family Polyporaceae. The 
genus Pycnoporus is divided into four species with different
geographic origins: P. cinnabarinus is widely distributed,
especially in the Northern hemisphere, P. coccineus
in countries bordering the Indian and Pacific Oceans,
P. sanguineus in the tropics and subtropics of both hemi- 
spheres, and P. puniceus, a rare species found in Africa, 
India, Malaysia and New Caledonia. Pycnoporus mycelia and 
fruiting bodies are characterized by red-to-orange pigmenta- 
tion due to phenoxazinone pigments including cinnabarin, 
tramesanguin and cinnabarinic acid [9]. P. cinnabarinus is a 
heterothallic homobasidiomycete with a tetrapolar mating 
system. Its life cycle includes a short monokaryotic stage 
after spore germination, followed after mating by an indefin- 
ite dikaryotic stage where karyogamy and meiosis can take 
place. The fungus is able to produce fruiting body structures 
and to generate stable monokaryotic cell-lines amenable to 
genetic improvement by formal genetics and genetic engineering,
e.g. development of expression systems for high-level
ligninase production [10].

P. cinnabarinus has a large array of copper- and 
iron-containing metalloenzymes involved in transform- 
ing plant-cell-wall aromatics [11,12] and harbors ori- 
ginal metabolic pathways involved in functionalizing 
these cell-wall aromatics to yield high-added-value 
compounds such as aromas and antioxidants [13,14]. 
P. cinnabarinus is listed as a food- and cosmetic-grade 
microorganism [15]. Among enzymes involved in lig- 
nin degradation, P. cinnabarinus is known to produce 
high-redox-potential laccase as the predominant en- 
zyme at very high levels of up to 1 g per liter [16,17]. 
The potential of Pycnoporus fungi lies in their laccases 
which find a variety of applications, such as bioconver- 
sion of agricultural by-products and raw plant materials 
into valuable products, biopulping and biobleaching paper 
pulp [18-22], dye bleaching in the textile and dye indus- 
tries [23-25], wastewater treatment [26-28], removal of 
phenolic compounds in beverages [29], biosensor and bio- 
fuel cell construction [30], and producing substances of 
pharmaceutical importance [31]. 

All studies performed over the last decade support the 
Pycnoporus genus as a strong contender for green and 
white biotechnology applications. Here, we describe the 
sequencing and annotation of the P. cinnabarinus mono- 
karyotic cell-line BRFM137 genome, its growth profiling 
and its secretome analyses under different culture condi- 
tions. Lignocellulolytic repertoires of P. cinnabarinus are 
highlighted and compared with other fungal counterparts. 
P. cinnabarinus emerges as a versatile white-rot fungus 
for biotechnological applications. 
Methods

Strain, DNA preparation and culture conditions 

Monokaryotic strain P. cinnabarinus BRFM137 [9] was
obtained from the International Centre of Microbial
Resources dedicated to Filamentous Fungi (CIRM-CF,
Marseille, France,
http://cirm.esil.univ-mrs.fr/crbmarseille/pages/index_mizenpage.php).
Genomic DNA was isolated from ground mycelia powder as
described in Lomascolo et al. [32], and a roughly 180 µg
sample was sent to GATC Biotech AG (Konstanz, Germany)
for genome sequencing.

For construction of the cDNA library, P. cinnabarinus
BRFM137 was grown as described in Lomascolo et al.
[17] with five types of substrate: (i) 20 g/L maltose, (ii)
20 g/L maltose and 0.1 g/L ferulic acid, (iii) 5 g/L maltose
and 15 g/L oat spelt xylan (Sigma), (iv) 5 g/L maltose and
15 g/L autoclaved maize bran (ARD, Pomacle, France)
and (v) 5 g/L maltose and 15 g/L Avicel cellulose (Sigma).
The other constituents of the medium were: diammonium
tartrate (1.84 g/L); disodium tartrate (2.3 g/L);
KH2PO4 (1.33 g/L); CaCl2·2H2O (0.1 g/L);
MgSO4·7H2O (0.5 g/L); FeSO4·7H2O (0.07 g/L);
ZnSO4·7H2O (0.046 g/L); MnSO4·H2O (0.035 g/L);
CuSO4·5H2O (0.007 g/L); yeast extract (1 g/L). After
4-5 days of cultivation, the fungal mycelia from each of
the five culture conditions were homogenized in liquid
nitrogen, and total RNA was extracted following a standard
phenol/chloroform method [33]. RNA from all the culture
conditions was pooled and sent to GATC Biotech AG
for reverse transcription and sequencing via Illumina
technology.

For proteomic analysis, liquid cultures (LC) with
non-immobilized or immobilized mycelia on 2 × 2 × 1 cm
polyurethane cubes (10 per vial) were run in 250 ml baffled
flasks containing 100 ml medium according to Lomascolo
et al. [17]. Three LC conditions were used: (i) 20 g/L maltose
(LC-M), (ii) 5 g/L maltose, 15 g/L Avicel cellulose (Sigma)
and 15 g/L autoclaved maize bran (ARD) (LC-M-MB-A),
and (iii) 5 g/L maltose and 15 g/L micronized birchwood
(LC-B). Solid-state fermentation (SSF) cultures were also
performed with five different substrates: sugarcane bagasse
(Orizaba, Mexico), banana skins, wood shavings
(Farmer Litter, Weldom, France), hemp (Zolux Litter,
Weldom, France) and micronized birchwood. Each substrate
was homogenized in water to obtain a moisture
content of 70% (w/w). Five grams of substrate (wet
weight) was placed in a 250 ml flask and inoculated
with 1.2 ml of mycelial suspension (50 ml of nutrient
medium and two mycelial mats from precultures) according
to a protocol adapted from Meza et al. [34,35].
For each growth condition, culture supernatants were
harvested after 3, 7 and 10 days of cultivation and then
pooled.
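
As a worked example of the solid-state substrate preparation, and assuming that the 70% (w/w) moisture content is expressed on a wet-weight basis (the text does not state the basis), each 5 g portion of wet substrate would contain roughly:

```latex
\begin{align*}
m_{\text{water}} &= 0.70 \times 5\ \text{g} = 3.5\ \text{g},\\
m_{\text{dry}}   &= 5\ \text{g} - 3.5\ \text{g} = 1.5\ \text{g}.
\end{align*}
```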

For growth profiling on 35 carbon sources, P. cinnabarinus
BRFM137 was grown on agar plates according
to Espagne et al. [36], using either 10 g/L simple
carbohydrates or 30 g/L complex carbohydrates. The kraft
lignin was purchased from Sigma (reference: 370959).

Genome sequencing and data assembly 

The P. cinnabarinus BRFM137 genome was sequenced
by a combination of methods: (i) sequencing of genomic
DNA and two normalized cDNA libraries obtained from
cultures grown on different substrates (maltose, oat spelt
xylan, cellulose and autoclaved maize bran) using 454/GS
Roche FLX Titanium technology, (ii) sequencing of
genomic DNA with Illumina/Solexa Genome Analyzer II
technology, and (iii) sequencing of a 3 kbp paired-end
genomic library using Illumina/Solexa Genome Analyzer
II technology. The genomic Roche 454 read sets were
uploaded to the ng6 storage environment [37]. Reads
were cleaned using pyrocleaner [38], which applied a
low-complexity filter followed by a read-size filter (over
100 bp) and a duplication-removal filter. The 454 reads
were then assembled using wgs-assembler version 6.0.
The Illumina mate pair reads were filtered using the
contig alignment information. All short aligned read
pairs and long reads were then reassembled to produce
contigs and scaffolds using the same assembly software
version. The 454 transcriptome reads of P. cinnabarinus
were also cleaned using pyrocleaner, but this time
the duplicated sequences were not filtered out. The
reads were de novo assembled using tgicl (TIGR Gene
Indices clustering tools) and annotated using various
databases. The reads and contigs were aligned on the
genome using Exonerate to produce gff files. These gff
files were uploaded to gbrowse
(http://genome-browser.toulouse.inra.fr:9090/cgi-bin/gb2/gbrowse/).
Gene prediction was performed using Augustus [39] with the
fungal gene model Phanerochaete chrysosporium. The
corresponding gff outputs were also uploaded to the gbrowse
environment. Ensembl fungal transcripts of Aspergillus
fumigatus, Aspergillus terreus, Aspergillus nidulans,
Schizosaccharomyces pombe, Aspergillus clavatus, Aspergillus
niger, Aspergillus flavus and Aspergillus oryzae were also
aligned on the genome and the results uploaded to
gbrowse. For P. cinnabarinus, a biomart environment was
set up to link de novo contigs to their genomic alignment
location (http://genomebrowser.toulouse.inra.fr:9090/
biomart/martview).
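
The three read-cleaning steps named above (low-complexity filter, read-size filter, duplicate removal) can be sketched as follows. This is only an illustration of the order of the filters and is not pyrocleaner's implementation: the function names, the use of dinucleotide Shannon entropy as the complexity measure and the 2.0-bit threshold are all assumptions made for the example.

```python
import math
from collections import Counter


def dinucleotide_entropy(seq):
    """Shannon entropy (bits) of overlapping dinucleotide frequencies."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return 0.0
    total = len(pairs)
    return -sum(c / total * math.log2(c / total)
                for c in Counter(pairs).values())


def clean_reads(reads, min_len=100, min_entropy=2.0, drop_duplicates=True):
    """Apply the filters in the order given in the text.

    min_entropy is a hypothetical threshold chosen for this sketch.
    """
    seen = set()
    kept = []
    for seq in reads:
        if dinucleotide_entropy(seq) < min_entropy:   # low-complexity filter
            continue
        if len(seq) <= min_len:                       # keep reads over 100 bp
            continue
        if drop_duplicates:                           # exact-duplicate removal
            if seq in seen:
                continue
            seen.add(seq)
        kept.append(seq)
    return kept
```

Calling `clean_reads(reads, drop_duplicates=False)` mirrors the transcriptome case described above, where duplicated sequences were kept.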

All the data are available at the European Nucleotide 
Archive (ENA), EMBL-EBI, Accession number: [EMBL: 
PRJEB5237]. 

Gene prediction and functional annotation 
Orthologous groups construction 

Orthologous groups (OGs) were built by running OrthoMCL
[40] software on the best protein models from: 1) P. cinnabarinus
BRFM137, 2) Trametes versicolor (TaxID: 717944),
3) Postia placenta (TaxID: 561896), 4) Phanerochaete
chrysosporium RP-78 (TaxID: 273507), 5) Schizophyllum
commune H4-8 (TaxID: 578458), 6) Coprinopsis
cinerea okayama7#130 (TaxID: 240176), 7) Laccaria
bicolor S238N-H82 (TaxID: 486041), 8) Agaricus bisporus
(TaxID: 936046), 9) Gloeophyllum trabeum (TaxID:
670483), 10) Ustilago maydis (TaxID: 5270), 11) Saccharomyces
cerevisiae RM11-1a (TaxID: 285006), 12) Schizosaccharomyces
pombe (TaxID: 4896), 13) Aspergillus niger
(TaxID: 380704), 14) Trichoderma reesei QM6a
(TaxID: 431241), 15) Nectria haematococca (TaxID:
140110), 16) Neurospora crassa (TaxID: 367110), 17)
Myceliophthora thermophila (TaxID: 573729), 18) Chaetomium
globosum (TaxID: 306901), 19) Mucor circinelloides
(TaxID: 747725), 20) Homo sapiens (TaxID: 9606)
and 21) Arabidopsis thaliana (TaxID: 3702). Each OG
is a set of proteins across one or more species in
the 21 listed genomes that represents putative orthologs
and in-paralogs. All-versus-all BLASTP was set to
a 10"^ cutoff. 

Global functional annotation 

Global functional annotation was based on the analysis
of each OG. All 15788 OGs were used as a seed for the
functional annotation process based on the bioinformatics
initiative Gene Ontology [41]. OGs containing at
least one sequence from P. cinnabarinus were selected
(7002 OGs). All sequences included in each OG were ordered
following the species list above. Sequences from each
OG were queried using BLAST against the NCBI
non-redundant (NR) protein database. A strict E-value
threshold of 10"^^^ was applied to select homologous
sequences retrieved by BLASTP. These homologs were mapped
to the global Gene Ontology annotation files
(ftp://ftp.pir.georgetown.edu/databases/idmapping/idmapping.tb.gz).

If GO information was retrieved for the first sequence, 
the process was ended; if no information was retrieved 
for the first sequence in the OG list, the second se- 
quence was used for mapping. In the particular case 
where several sequences were present in the same spe- 
cies, sequences were ordered by length. All the coding 
sequences (CDS) not included in OGs were directly 
BLASTed as described above. 
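
The transfer rule described above (walk through the members of an OG in species-list order and stop at the first one that carries GO information) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name, the tuple layout and the toy identifiers are hypothetical, and the choice of longest-first ordering within a species is an assumption, since the text only says that such sequences were ordered by length.

```python
def annotate_og(og_members, species_order, go_lookup):
    """Return (sequence_id, GO terms) for the first OG member with GO info.

    og_members: list of (species, sequence_id, length) tuples.
    species_order: species names in priority order (the list used for OGs).
    go_lookup: dict mapping sequence_id to a list of GO terms.
    Returns None when no member of the OG has GO information.
    """
    rank = {species: i for i, species in enumerate(species_order)}
    # Species priority first; within a species, longest sequence first
    # (assumption: the direction of the length ordering is not stated).
    ordered = sorted(og_members, key=lambda m: (rank[m[0]], -m[2]))
    for _species, seq_id, _length in ordered:
        terms = go_lookup.get(seq_id)
        if terms:
            return seq_id, terms
    return None
```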

Identification of repeated sequences 

RepeatScout [42] was used to identify de novo DNA repeats
in the P. cinnabarinus genome. Default parameters
(with l = 15) were used. The RepeatScout library was
then filtered as follows: i) all the sequences less than
100 bp in size were discarded; ii) repeats counting less
than ten copies in the genome were removed (as they
may correspond to protein-coding gene families) and iii)
repeats having significant hits to known proteins in UNIPROT
(The UNIPROT Consortium, 2008) other than
proteins known to belong to transposable elements
(TEs) were removed. The remaining consensus sequences
were annotated manually by a TBLASTX search [43] against
RepBase [44] to classify them into known TE families. To
identify full-length long terminal repeat (LTR)
retrotransposons, a second de novo search was performed
with LTR_STRUC [45]. The TBLASTX algorithm was used to
check the full-length candidate LTR retrotransposon
sequences for homology against the sequences from the
RepBase database. The number of repeat element occurrences
and the percent of genome coverage were assessed
using RepeatMasker [46] by masking the genome assembly
with the consensus sequences coming from the
RepeatScout and LTR_STRUC pipelines. MISA
(http://pgrc.ipk-gatersleben.de/misa/download/misa.pl) was used
with default parameters to identify mono- to hexanucleotide
simple sequence repeat (SSR) motifs. Minisatellites
(motif of 7 to 100 bp) and satellites (motif
>100 bp) were searched for in the P. cinnabarinus genome
using Tandem Repeats Finder software [47] with
the following parameters: 2; 7; 7; 80; 10; 50; 500.
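
The three filters applied to the RepeatScout library can be expressed compactly as below. This is a sketch under stated assumptions rather than the authors' code: the `Consensus` record and its field names are hypothetical, and the protein-hit flags are assumed to have been computed beforehand from the UNIPROT search.

```python
from dataclasses import dataclass


@dataclass
class Consensus:
    name: str
    length: int         # consensus length in bp
    copies: int         # number of copies found in the genome
    protein_hit: bool   # significant hit to a known protein
    hit_is_te: bool     # that protein belongs to a transposable element


def filter_repeat_library(library, min_len=100, min_copies=10):
    """Keep consensus sequences that pass filters i) to iii) from the text."""
    kept = []
    for cons in library:
        if cons.length < min_len:                      # i) shorter than 100 bp
            continue
        if cons.copies < min_copies:                   # ii) fewer than ten copies
            continue
        if cons.protein_hit and not cons.hit_is_te:    # iii) non-TE protein hit
            continue
        kept.append(cons)
    return kept
```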

Carbohydrate-active enzyme and lignin degradation
enzyme annotation

All putative proteins were compared to the entries in
the CAZy database [48,49] using BLASTP. The proteins
with E-values smaller than 0.1 were further screened by
a combination of BLAST searches against individual protein
modules belonging to the AA (Auxiliary Activities),
GH (Glycosyl Hydrolases), GT (GlycosylTransferases),
PL (Polysaccharide Lyases), CE (Carbohydrate Esterases)
and CBM (Carbohydrate-Binding Modules) classes
(http://www.cazy.org/). HMMer 3 [50] was used to query against
a collection of custom-made hidden Markov model
(HMM) profiles constructed for each CAZy family.
Within families, subfamilies were manually defined according
to the homology relationships between members of
the focal family. Protein sequences obtained from automatic
prediction by Augustus software were annotated via
this procedure, and all identified proteins were then
manually curated.
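
The first screening step (keeping proteins whose BLASTP hit against CAZy has an E-value below 0.1) might look like the sketch below. It assumes the search results are available in the standard 12-column BLAST tabular layout, where the query identifier is the first field and the E-value the eleventh; the source does not state which output format was used, and the function name is hypothetical.

```python
def candidate_cazymes(blast_tab_lines, max_evalue=0.1):
    """Return sorted query IDs with at least one hit below max_evalue.

    Expects tab-separated lines in the 12-column BLAST tabular layout
    (query ID in column 1, E-value in column 11).
    """
    candidates = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and comments
            continue
        fields = line.split("\t")
        if float(fields[10]) < max_evalue:
            candidates.add(fields[0])
    return sorted(candidates)
```

Candidates retained at this permissive threshold would still need the module-level BLAST searches, HMM profile searches and manual curation described above.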

Structural annotation of the corresponding genes encoding
oxidative enzymes (number, size and position of introns)
was checked manually. To do this, each AA sequence
detected was BLASTP-searched against the NCBI
non-redundant database. The results with the most
satisfactory E-values and coverage were retained.

Then, first the target protein sequence was aligned
with the sequence previously selected by BLASTP
using ClustalW (http://www.genome.jp/tools/clustalw/);
second, the target nucleic acid sequence was translated
in three reading frames
(http://www.ebi.ac.uk/Tools/st/emboss_sixpack/). Gene intron
splice sites were determined based on consensus sequences
fitting the GT-AG rule as described in Breathnach et al. [51].
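
The GT-AG check used when placing intron boundaries can be illustrated with a short sketch. The helper names and the toy sequence are hypothetical, and coordinates are assumed to be 0-based with an exclusive end; the authors performed this step manually.

```python
def follows_gt_ag(genomic, intron_start, intron_end):
    """True if genomic[intron_start:intron_end] begins with GT and ends with AG."""
    intron = genomic[intron_start:intron_end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(genomic, introns):
    """Remove the given (start, end) introns after checking the GT-AG rule."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if not follows_gt_ag(genomic, start, end):
            raise ValueError(f"intron {start}-{end} does not fit the GT-AG rule")
        exons.append(genomic[position:start])
        position = end
    exons.append(genomic[position:])
    return "".join(exons)
```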

Identification of proteins in secretomes by LC-MS/MS 
analysis 

Proteins from the diafiltered supernatants of P. cinnabarinus
BRFM137 cultures were separated by 1D SDS-PAGE
electrophoresis according to the protocol of Couturier
et al. [52]. After protein trypsinolysis, peptide analysis
was performed by LC-MS/MS as described in Arfi et al.
[53] using the PAPPSO platform facilities (Jouy-en-Josas,
France; http://pappso.inra.fr). Based on the list of peptides,
proteins were identified by querying the MS/MS data
against the predicted proteins obtained from the P. cinnabarinus
genome de novo sequencing data.

Annotation of protein secretion and glycosylation pathways 

A. niger proteins related to protein secretion and
glycosylation according to Pel et al. [54] and extended with
additional proteins were used in a BLASTP search
against the P. cinnabarinus fasta file. The first hits were
compared to the A. niger proteins to identify bi-directional
BLAST best hits. An E-value cut-off of 10"^^ was used.
The description of the gene products was taken from the
Saccharomyces Genome Database (SGD) after identifying
the S. cerevisiae orthologs.
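
A bi-directional best hit search of the kind described above can be sketched as follows. The function names and the `(query, subject, evalue)` tuple layout are assumptions for illustration, and the E-value cutoff is left as a required argument rather than given a default value.

```python
def best_hits(hits, max_evalue):
    """Map each query to its lowest-E-value subject among hits under the cutoff.

    hits: iterable of (query, subject, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _e) in best.items()}


def bidirectional_best_hits(forward, reverse, max_evalue):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    fwd = best_hits(forward, max_evalue)
    rev = best_hits(reverse, max_evalue)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)
```

Here `forward` would hold the A. niger queries against P. cinnabarinus and `reverse` the opposite search.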

Results and discussion 

Characteristics of the P. cinnabarinus genome

The genome of the monokaryotic strain P. cinnabarinus
BRFM137 was sequenced by 454 pyrosequencing and Illumina
sequencing runs to reach a final 31-fold coverage.
The genome was ultimately assembled into 784 scaffolds
with an N50 of 165118 bp. Table 1 reports the features of the
assembled genome sequences. The G + C content of the
P. cinnabarinus genome was 52.55%. Genome size was
33.67 Mb and a total of 10,442 ORFs were identified in
the structural annotation procedure. The number of ORFs
in P. cinnabarinus is close to the average number among
the order Polyporales. For instance, Phanerochaete
chrysosporium, Postia placenta, Wolfiporia cocos and
Ceriporiopsis subvermispora count 10048, 12541, 12747 and
12125 detected ORFs in their genomes, respectively
[4,6,55,56]. The P. cinnabarinus genome size is slightly lower
than in P. placenta (42.5 Mb), C. subvermispora (39 Mb)
and W. cocos (50.5 Mb).

Table 1 Statistical assembly of the P. cinnabarinus genome

Total scaffolds                       784
Total bases in scaffolds              33 133 717
Total span of scaffolds               33 638 736
Coverage                              31.0916
Length of genome assembly (Mb)        33.67
ORFs number                           10442
GC content (%)                        52.55
Average number of exons per gene      6.7
Average exon size (bp)                257.42
Average coding sequence size (bp)     1774.36
N50 scaffold bases                    165118
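
For readers unfamiliar with the N50 statistic quoted for the assembly, a minimal sketch of how it is usually computed from a list of scaffold lengths is given below. The function name is illustrative and the example lengths in the usage note are invented, not taken from the assembly.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")
```

For instance, `n50([2, 3, 4, 5, 6, 7, 8, 9, 10])` returns 8, because the three longest scaffolds (10 + 9 + 8 = 27) already cover half of the 54 bases.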

In general, functional annotation hinges on the propagation
of existing functional information via single homology
searches. The resolution of functional inference
could be improved by differentiating homologs into orthologs
(homologous genes resulting from a speciation event)
and paralogs (homologous genes resulting from a duplication
event) [57]. Orthologs are assumed to have more
chance of sharing the same function than paralogs. Gene
duplication is an essential contributing factor for evolving
novel functions, and one of the duplicates could undergo
evolutionary events such as sub-functionalization,
neo-functionalization, etc. (see [58,59] for review). We
therefore based our annotation strategy on the searches for
OGs within 21 selected genomes followed by similarity
searches from each OG. An outline flow of the functional
annotation procedure based on this phylogenomic approach
is shown in Figure 1. A total of 15,788 OGs were retrieved
using a best reciprocal hit approach. The OGs included
8,647 putative CDS from P. cinnabarinus, totaling ~83%
of total CDS. Based on sequence homology searches
within each OG against the NR database using a strict
E-value cutoff of 10"^^^, 5,018 genes were annotated
across the GO categories. In addition, 399 orphan genes
were annotated using the standard Blast2GO procedure.
The annotation procedure enabled us to annotate 5,417
CDS corresponding to ~52% of total CDS (Additional file 1:
Table S1). By comparison with the classical method, fewer than
30% of total CDS were annotated using the Blast2GO procedure
alone. Our approach based on ortholog clustering enables
us to infer functional information directly from OGs using a
subsequent drastic threshold for similarity searches and
offers a conceptual framework for inferring information from
various genomes. The 5,417 annotated genes were grouped
into functional groups (Figure 2). Finally, a GO tree depth
was calculated to assess amount and quality of GO annotations
(Figure 3).
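
The percentages quoted in this paragraph follow directly from the counts given in the text, taking the 10,442 predicted ORFs as the total number of CDS (an assumption, since the denominator is not stated explicitly):

```latex
\begin{align*}
\frac{8647}{10442} &\approx 0.828 \quad (\sim 83\%\ \text{of CDS placed in OGs}),\\
5018 + 399 &= 5417,\\
\frac{5417}{10442} &\approx 0.519 \quad (\sim 52\%\ \text{of CDS annotated}).
\end{align*}
```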

Figure 1 Annotation strategy for P. cinnabarinus based on a phylogenomic approach. Orthologous groups (OGs) were formed from 21
genomes by running the OrthoMCL software using a BLASTP cutoff E-value of le"^. OGs containing at least one sequence from P. cinnabarinus
were selected (7002 OGs) and used as a seed for the functional annotation process based on the bioinformatics initiative Gene Ontology.
Sequences from each OG were BLAST-queried against the NCBI non-redundant (NR) protein database using a cutoff E-value of 10"^^°. The mapping
procedure was carried out with the global Gene Ontology annotation files. The process was ended once GO information was retrieved. For orphan
genes, the coding sequences were directly annotated using B2Go procedures.

Figure 2 Annotation of the P. cinnabarinus genome. Classification scheme is summarized in three main GO categories, i.e. biological process,
cellular component, molecular function. Some genes have more than one GO annotation.

Repeated sequences were identified in the genome of
P. cinnabarinus and a library of 1,118 consensus sequences
was generated using RepeatScout [41]. After the
different filtering steps, we were left with 190 consensus
sequences: 13, 9, 5 and 5 consensus sequences showed
homologies with Class 1 gypsy, copia, DIRS and Long
Interspersed Element (LINE) retrotransposons, respectively,
and 8 with Class 2 transposons (Table 2). The
remaining 150 consensus sequences were uncategorized.
Of the 9 putative full-length LTRs identified using
LTR_STRUC, three were attributed to Gypsy/Ty3-like elements
and two to Copia/Ty1-like elements. The remaining
four sequences were excluded from further analyses.
RepeatMasker masked 8.21% of the genome assembly: 2.91% by
repeated elements belonging to unknown/uncategorized
families, 2.5% by Class 1 Gypsy retrotransposons, 0.95%
by Class 1 Copia retrotransposons, 0.2% by Class 1 DIRS
retrotransposons, 0.7% by Class 1 LINE retrotransposons
and 0.95% by Class 2 DNA transposons (Table 2).

The number of full-length LTRs was lower in P. cinnabarinus
than in other white-rots [6], although the TE genome
coverage was in the range of other white-rot fungi. A
total of 1,707 SSRs were identified in the P. cinnabarinus
genome corresponding to 350 mono-, 380 di-, 820 tri-,
91 tetra-, 25 penta- and 41 hexanucleotide motifs. A
total of 2368 minisatellites and 10 satellites were identified
for a genome coverage of 0.42% and 0.01%, respectively.
The number of microsatellites was in the range of
those found in other white-rot and Polyporaceae genomes,
although the genome of P. cinnabarinus was less rich in
minisatellite and satellite sequences [60].
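
To make the mono- to hexanucleotide SSR classes concrete, the sketch below finds perfect tandem repeats with a regular expression. It is not MISA and does not reproduce MISA's default settings: the function name is hypothetical and the minimum repeat numbers are passed in by the caller (the values in the usage note are arbitrary).

```python
import re


def find_ssrs(seq, min_repeats):
    """Return sorted (start, motif_length, motif, n_units) for perfect SSRs.

    min_repeats maps motif length (1 to 6) to the minimum number of
    tandem units required; the thresholds are supplied by the caller.
    """
    seq = seq.upper()
    found = []
    for size in sorted(min_repeats):
        # One motif of `size` bases followed by at least (min - 1) exact copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (size, min_repeats[size] - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA").
            if any(motif == motif[:d] * (size // d)
                   for d in range(1, size) if size % d == 0):
                continue
            found.append(
                (match.start(), size, motif, len(match.group(1)) // size))
    return sorted(found)
```

With thresholds such as `{1: 10, 2: 6, 3: 5}`, a run of twelve A bases is reported once as a mononucleotide SSR and not again as a dinucleotide "AA" repeat.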

Carbohydrate metabolism, lignin-degrading oxidoreductases 
and wood decay 

Carbohydrates and lignin are intimately interconnected
in all land-plant cell walls. The accessibility of all
cell-wall components, i.e. cellulose, hemicellulose, pectin and
lignin, is strongly limited by the covalent cross-linkages
of the constituents, which create an intricate network
and a physical barrier that resists microbial breakdown.
Among the predicted lignin-degrading activities, a total
of five laccases (AA1_1), one ferroxidase (AA1_2), one
multicopper oxidase (AA1), nine ligninolytic peroxidases
(AA2) including lignin peroxidases (LiP), manganese
peroxidases (MnP) and versatile peroxidases (VP), one
cellobiose dehydrogenase containing an iron reductase
domain (AA8-AA3_1), three aryl-alcohol oxidases and one
glucose oxidase (AA3_2), two alcohol oxidases (AA3_3),
two pyranose oxidases (AA3_4), seven copper radical
oxidases (AA5_1), one benzoquinone reductase (AA6), and
one iron reductase domain (AA8) linked to a CBM1 were
identified (Table 3 and Additional file 2: Table S2).
P. cinnabarinus was initially considered to lack class II
peroxidases based on extracellular activities in the culture
medium [16]. Remarkably, nine class II peroxidases were
annotated and divided into at least four LiP, three MnP,
one VP and one atypical VP. On average, white-rot fungi
have 12 members of the AA2 family (Table 4). The only
exception is S. commune in which the AA2 family is absent
[61], although it is considered as a white-rot fungus despite
limited lignin-degrading ability. Members of family AA2
can be considered as one of the most important family
markers to differentiate white-rot and brown-rot fungi,
since brown-rot (BR) fungi contain no AA2 members
[6,49]. In addition to class II peroxidases, P. cinnabarinus
contains several laccases (AA1_1) and one cellobiose
dehydrogenase (AA8-AA3_1), meaning that this fungus
contains a complete, versatile ligninolytic enzymatic spectrum.
A number of enzymes are proposed to supply the hydrogen
peroxide required for oxidase activity. Among these,
the best established candidate is glyoxal oxidase of family
AA5_1, and P. cinnabarinus has seven candidate gene
models in this family. Interestingly, P. cinnabarinus also
possesses several other hydrogen peroxide providers, such
as GMC oxidoreductases from family AA3_2 which includes
at least three aryl-alcohol oxidases. In summary, the
white-rot fungus P. cinnabarinus possesses a complete
enzymatic arsenal for lignin breakdown. The full set of
ligninolytic enzymes identified suggests that this fungus
may exploit different strategies for ligninolysis, including
oxidation mediated by class II peroxidases requiring
hydrogen peroxide or by laccases in the presence of
redox mediators, or via Fenton chemistry [49,62,63].

Table 2 Number of repeated sequences in the P. cinnabarinus genome

                           Number of        Number of   Genome assembly
                           families         copies      coverage (%)
Class-1 LTR Gypsy-like     16 (13* + 3**)   642         2.5
Class-1 LTR Copia-like     11 (9* + 2**)    306         0.95
Class-1 DIRS               5                62          0.2
Class-1 LINE               5                163         0.7
Class-2 DNA Transposons    8                251         0.95
Uncategorized              150              2831        2.91
All                        168              4255        8.21

*Number of elements identified by RepeatScout pipeline.
**Number of elements identified by LTR_STRUC pipeline.

Table 3 Global composition of AA encoding genes found in P. cinnabarinus BRFM137

Family(a)   Known activities                         Total number
AA1_1       Laccase                                  5
AA1_2       Ferroxidase                              1
AA1         Multicopper oxidase                      1
AA2         Class II peroxidase                      9 (+1 partial)
AA3_1       Cellobiose dehydrogenase                 1
AA3_2(b)    Aryl-alcohol oxidase/Glucose oxidase     19
AA3_3       Alcohol oxidase                          2
AA3_4       Pyranose oxidase                         2
AA5_1       Glyoxal oxidase                          7
AA6         1,4-benzoquinone reductase               1
AA8         Iron reductase domain                    2
AA9         Lytic polysaccharide monooxygenase       15

(a) Known (sub)family activities are as follows: AA1_1: laccase; AA1_2:
ferroxidase; AA1: multicopper oxidase; AA2: class II peroxidase; AA3_1:
cellobiose dehydrogenase; AA3_2: aryl alcohol oxidase, glucose oxidase;
AA3_3: alcohol oxidase; AA3_4: pyranose oxidase; AA5_1: glyoxal oxidase,
copper radical oxidase; AA6: benzoquinone reductase; AA8: iron reductase
domain; AA9: LPMO. (b) Including 3 AO and 1 GOx. According to [49].

P. cinnabarinus is fully equipped with putative enzymes
from families classically involved in cellulose degradation
(GH1, GH3, GH5, GH6, GH7, GH12, GH45) and can grow
on pure cellulose. However, P. cinnabarinus possesses the
smallest number of GH members among the white-rot
fungi. The P. cinnabarinus genome encodes 15 lytic
polysaccharide monooxygenases (LPMOs) of family AA9, a
number similar to that encoded by other white-rot fungal
genomes (Table 4). The P. cinnabarinus BRFM137 genome
contains a gene encoding a CDH (ORF scf185013.g1). This
gene codes for the CDH already described by Moukha
et al. [11], Sigoillot et al. [64] and Bey et al. [65]. Concerning
xylan degradation, only two GH10 and two GH43 enzymes
were identified in P. cinnabarinus, which is less
than the average number of representatives in the
white-rot group (5.2 and 9, respectively). No members of family
GH51 could be found in the P. cinnabarinus genome. The
GH51 family includes α-L-arabinofuranosidases acting on
terminal non-reducing α-L-arabinofuranose residues in
arabinose-containing compounds [66]. Terminal arabinose
residues are found in the rhamnogalacturonan I from
dicot primary cell walls, and glucuronoarabinoxylan from
grass primary cell walls, so the absence of GH51 could
partly constrain the complete degradation of hemicelluloses
and pectic polysaccharides in P. cinnabarinus and is
consistent with the lack of such cell wall components in
wood. The number of other P. cinnabarinus genes encoding
pectinolytic enzymes also seems to be limited. The
members of family GH28 are fewer than the average number
found in other fungi, and no representative of family
GH54 including α-L-arabinofuranosidase was found. Also,
P. cinnabarinus contains no candidate gene of the
pectinolytic families PL1 (pectin/pectate lyase), PL3 (pectate
lyase), PL9 (pectate lyase), CE12 (rhamnogalacturonan
acetyl esterase) or GH53 (endo-β-1,4-galactanase).
P. cinnabarinus is the only fungus lacking a family GH53
member among the selected white- and brown-rots. Family
GH53 enzymes degrade galactans and arabinogalactans in
the pectic component of plant cell walls. This genomic
repertoire is consistent with the very poor growth of
P. cinnabarinus observed on apple pectin and citrus
pectin as substrates (Figure 4).

In conclusion, white-rot fungi possess more representatives
of lignocellulolytic enzymes than the brown-rot
group, especially in the families AA2 (12 vs. 0), AA3_2
(6.3 vs. 1.4), AA5_1 (6.8 vs. 3.4), AA9 (15.2 vs. 3.8) and
CBM1 (18.8 vs. 1.7) (Table 4). Based on these results,
P. cinnabarinus clearly belongs to the classical white-rot
fungi, with a distribution typical of all the specific
families found for this nutritional strategy.
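
The group averages quoted above can be recomputed from the per-species counts in Table 4. The following is a minimal sketch rather than the authors' procedure: it assumes that the first 13 species columns of Table 4 (Aude to Trve) are the white-rot fungi and the last 7 (Copu to Woco) the brown-rot fungi, a split inferred from the reported means. The names SPECIES, COUNTS and group_means are illustrative only.

```python
from statistics import mean

# Column order of Table 4. The split into 13 white-rot and 7 brown-rot
# species is an assumption inferred from the means quoted in the text.
SPECIES = ["Aude", "Cesu", "Disq", "Fome", "Galu", "Gano", "Hean", "Phch",
           "Pust", "Pyci", "Scco", "Sthi", "Trve",
           "Copu", "Dac", "Fopi", "Gltr", "Popl", "Sela", "Woco"]
N_WHITE_ROT = 13

# Gene counts copied from the corresponding rows of Table 4.
COUNTS = {
    "AA2":  [18, 16, 12, 17, 8, 9, 7, 16, 11, 9, 0, 6, 26,
             0, 0, 0, 0, 0, 0, 0],
    "CBM1": [43, 17, 17, 6, 14, 18, 17, 30, 21, 17, 5, 17, 23,
             2, 1, 0, 1, 0, 8, 0],
    "GH10": [4, 6, 5, 4, 7, 10, 2, 6, 5, 2, 5, 6, 6,
             3, 3, 2, 3, 3, 1, 4],
    "GH43": [28, 2, 7, 7, 11, 12, 4, 4, 7, 2, 19, 12, 3,
             6, 5, 7, 6, 1, 2, 1],
}


def group_means(family):
    """Return (white-rot mean, brown-rot mean) for one CAZy family."""
    values = COUNTS[family]
    assert len(values) == len(SPECIES)
    return mean(values[:N_WHITE_ROT]), mean(values[N_WHITE_ROT:])


for family in COUNTS:
    white, brown = group_means(family)
    pyci = COUNTS[family][SPECIES.index("Pyci")]
    print(f"{family}: white-rot {white:.1f}, brown-rot {brown:.1f}, Pyci {pyci}")
```

Under this assumed grouping, the AA2 and CBM1 means agree with the values given in the text after rounding, and the GH10 and GH43 white-rot means agree with the averages cited for xylan degradation.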

To study the growth ability of P. cinnabarinus, growth
profiling was performed on 35 carbon sources, including
mono-, oligo- or polysaccharides, crude plant biomass,
casein and lignin, and the profiles were compared with
the CAZy gene content of the P. cinnabarinus genome.
On average, growth was better on crude plant biomass
substrates than on pure mono-, oligo- or polysaccharides
(Figures 4 and 5). Interestingly, growth on cotton seed
hulls was poor, not only compared with the other plant
biomass substrates but also compared with several pure
substrates. Previous studies suggest that this is probably
due to the high lignin content of cotton seed hulls
(about 20-25%). The other plant biomass substrates are
poorer in lignin (2-4%), suggesting that high lignin
content inhibits growth of P. cinnabarinus in the culture
conditions tested. This correlates with very poor growth
on kraft lignin as sole substrate. As lignin may not be
a sole carbon source, the poor growth could be related to
possible impurities from lignocellulose introduced
during lignin preparation. Growth is better on
galactomannan (guar gum) than on xylan, suggesting a better
mannan-degrading system. Endomannanases are found
in GH5 and GH26, but there are no family GH26 members
in the P. cinnabarinus genome, suggesting that the
good growth is mainly due to the GH5_7 endomannanase,
together with the three GH2 β-mannosidases and one
GH27 α-galactosidase. Inulin- and sucrose-degrading
enzymes are found in GH32, but only one member of family
GH32 is found in the P. cinnabarinus genome. Considering
that growth on sucrose is significantly better than
growth on inulin, this gene probably encodes an invertase
rather than an inulinase. These growth profiling studies
estimated growth speed by measuring the diameter of the
fungal mycelium on plates. However, growth is
also related to the density of the mycelium, with dense,
medium and thin mycelia observed on the plates. The fast
growth on poor media, especially when no carbon source is
added, could also be due to thin mycelial expansion to
avoid starvation.

Table 4 Comparison of the CAZy repertoire identified in the selected white-rot and brown-rot fungal genomes

Family(a)    Aude Cesu Disq Fome Galu Gano Hean Phch Pust Pyci Scco Sthi Trve Copu  Dac Fopi Gltr Popl Sela Woco
GH1             1    3    4    5    3    3    2    2    1    1    3    3    2    3    1    2    5    2    3    1
GH2             7    4    4    2    3    3    3    2    4    3    4    3    5    5    3    4    4    3    3    3
GH3            14    6    8    8   12   13   12   11   14    7   12   17   13   13    9   12   11    6   11    8
GH5            43   18   19   20   19   18   16   19   18   17   16   20   22   21   24   19   19   17   20   18
GH6             2    1    1    2    1    1    1    1    1    1    1    1    1    2    0    0    0    0    1    0
GH7             8    3    4    2    3    3    1    9    5    3    2    3    4    2    0    0    0    0    0    0
GH10            4    6    5    4    7   10    2    6    5    2    5    6    6    3    3    2    3    3    1    4
GH11            3    1    0    0    0    0    0    1    1    0    1    1    0    0    0    0    0    0    0    0
GH12            1    2    3    3    3    3    4    2    2    3    1    5    5    4    1    2    2    2    1    2
GH13           10    7   10    6    9    8    8    9   10    7   13   14    7    6   11    7    9    7    7   11
GH15            2    3    2    1    3    3    5    2    4    1    3    3    4    2    2    4    2    2    2    2
GH26            0    0    0    0    0    0    0    0    0    0    1    0    0    0    4    0    0    0    0    0
GH27            5    4    6    4    6    3    4    3    5    1    1    5    4    4    2    4    3    3    3    3
GH28           14    6    7   17   13   10    8    4   13    4    3   17   11   13    6   12   10    8    7    9
GH29            3    0    0    0    0    0    2    0    1    0    2    4    0    4    2    0    1    0    1    0
GH31           11    5    6    5    6    7   10    6    8    5    4    8    5   12    6    5    5    4    5    5
GH32            2    0    2    0    1    1    1    0    1    1    1    1    3    2    1    3    1    0    0    0
GH35            6    1    3    2   10    7    4    3    4    2    4    7    2    2    1    2    2    1    3    2
GH36            0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0
GH43           28    2    7    7   11   12    4    4    7    2   19   12    3    6    5    7    6    1    2    1
GH45            2    2    1    0    2    2    2    2    1    1    1    1    2    1    1    1    1    0    0    0
GH51            3    2    2    1    2    2    1    2    3    0    2    3    2    3    2    4    4    1    1    4
GH53            1    6    1    1    1    1    1    1    2    0    1    2    1    1    1    1    2    1    1    1
GH54            0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
GH62            0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
GH74            1    1    1    4    1    1    1    4    2    1    1    2    1    0    0    0    1    0    1    0
GH78            4    1    5    2    5    4    2    1    7    1    3    3    3    2    0    3    2    3    2    3
GH88            2    1    1    2    1    1    1    1    1    2    1    1    1    1    1    1    1    1    1    1
GH93            1    0    1    0    2    2    0    0    1    0    2    1    0    1    0    0    0    0    0    0
GH95            1    1    1    2    1    2    1    1    1    1    2    1    1    1    0    1    1    1    1    1
GH105           3    0    1    1    1    1    2    0    2    0    2    2    1    0    0    2    2    0    0    0
GH115           2    2    2    3    4    3    1    1    1    2    2    2    2    2    2    1    2    1    1    2
GH131           2    1    3    2    3    3    2    3    2    3    2    3    3    2    1    1    1    0    2    0
Total GH      186   89  110  106  133  127  101  100  129   71  116  151  114  118   90  100  100   67   80   81
PL1             2    0    0    2    0    0    2    0    4    0    5    4    0    0    0    0    0    0    0    0
PL3             1    0    0    0    0    0    0    0    0    0    4    0    0    0    0    0    0    0    0    0
PL4             1    0    1    0    0    0    1    0    3    1    3    3    1    0    0    0    2    0    0    0
PL9             0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
Total PL        4    0    1    2    0    0    3    0    7    1   13    7    1    0    0    0    2    0    0    0
CE1             4    2    0    0    2    2    1    4    2    3   11    1    3    0    0    0    1    0    0    0
CE8             2    2    3    3    3    3    3    2    6    1    2    5    2    2    3    2    2    2    2    1
CE12            2    0    2    2    1    1    2    0    1    0    2    3    0    0    0    0    0    0    0    0
Total CE        8    4    5    5    6    6    6    6    9    4   15    9    5    2    3    2    3    2    2    1
CBM1           43   17   17    6   14   18   17   30   21   17    5   17   23    2    1    0    1    0    8    0
CBM5            8    3    5    5   10    9    5    3    4    3    3   10    6   11    1    4    4    5    5    4
CBM12           2    1    1    0    1    1    1    0    0    1    1    1    1    1    0    1    1    1    1    1
CBM13           4    6   11    4    9    8    1    5    7    4   17    4    6    6    2   11    1   15    5    5
CBM18           1    1    1    0    2    1    1    1    1    1    1    1    1    1    1    1    0    1    1    1
CBM20           4    4    2    2    3    3    3    2    4    1    1    5    4    2    1    2    2    1    2    1
CBM21           3    2    2    2    2    2    2    2    3    2    3    1    2    2    3    2    2    2    1    1
CBM35           2    0    1    1    0    1    2    1    1    0    1    2    1    0    1    1    2    1    2    1
CBM38           0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
CBM42           4    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
CBM43           1    1    1    1    1    1    1    1    1    1    1    2    1    1    1    1    1    2    1    1
CBM48           3    2    3    3    3    3    2    1    3    2    4    3    3    3    2    3    3    3    2    3
CBM50          21    1    2    2    8   10    1    1   10    1    5    1    1    1    1   15    2    4    1    0
CBM63           0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0
Total CBM      96   38   46   26   53   57   36   47   57   33   43   47   49   31   14   41   19   35   29   18
AA1_1           0    7   11   10   13   16   14    0   12    5    2   15    7    6    0    5    4    2    4    3
AA1_2           1    1    1    1    1    1    1    1    1    1    0    2    2    1    2    1    1    1    1    1
AA1_3           5    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
AA2            18   16   12   17    8    9    7   16   11    9    0    6   26    0    0    0    0    0    0    0
AA3_1           1    1    1    1    1    1    1    1    1    1    1    1    1    1    0    0    1    0    2    0
AA3_2(b)        3    6   11    2    4    8   13    3    6    4    2   15    5    0

(a) Selection of the CAZy families involved in plant cell wall degradation. The full list of CAZy families is provided in Additional file 1: Table S1.
(b) AA3_2 includes only models with similarity to aryl-alcohol oxidase and glucose oxidase.

Species list: Auricularia delicata (Aude); Ceriporiopsis subvermispora (Cesu); Dichomitus squalens (Disq); Fomitiporia mediterranea (Fome); Ganoderma lucidum (Galu);
Ganoderma sp. (Gano); Heterobasidion annosum (Hean); Phanerochaete chrysosporium (Phch); Punctularia strigosozonata (Pust); Pycnoporus cinnabarinus (Pyci);
Schizophyllum commune (Scco); Stereum hirsutum (Sthi); Trametes versicolor (Trve); Coniophora puteana (Copu); Dacryopinax sp. (Dac); Fomitopsis pinicola (Fopi);
Gloeophyllum trabeum (Gltr); Postia placenta (Popl); Serpula lacrymans (Sela); Wolfiporia cocos (Woco).



Gene structure and localization of the ligninolytic repertoire 
in P. cinnabarinus 

Descriptions of the P. cinnabarinus laccases (AA1_1) 

Five laccases stricto sensu (AA1_1), one multicopper
oxidase (Mco, AA1) and one ferroxidase (AA1_2) sequence
were identified in the genome and in the cDNA library,
in some cases only partially (Additional file 3: Table S3).
Structural annotation of the genes (designated lac1 to lac5)
was performed.
and the gene lcc3-1 15 (or lac1) coding for Lac1 protein
was identified [17,67]. In 2000, Otterbein et al. [68]
demonstrated the presence of a second laccase isoenzyme,
called Lac2, in the culture medium of P. cinnabarinus
BRFM137. Lac2 was purified and its N-terminal sequence
was determined [68]. However, the corresponding gene has
never been identified and cloned, and the biochemical
properties of the Lac2 protein have never been determined.
Based on the N-terminal sequence, we were able to
determine the corresponding gene sequence, named lac2,
from the strain BRFM137 genome sequencing data
(Additional file 4: Table S4). The five laccase-encoding
genes are about 2.1-2.3 kb long and are interrupted by
10 to 12 introns. Based on the intron and exon positions
of each gene, we were able to classify the various laccase
genes into three groups (Additional file 5: Figure S1).
The lac2/lacS (12 introns) and lac1/lacS (10 introns) pairs
have a similar structural organization with homologous
intron positions, whereas the lac4 gene is organized
slightly differently (length of exons and introns). The
lac4 gene comprised 11 exons but showed a slightly
different structure from lac1 and lacS, and an
experimentally found stop codon was confirmed in exon 6
(Additional file 5: Figure S1). In contrast to the other
laccase-encoding genes, the full-length lac4 mRNA could
not be found. The multiplicity of laccase genes and their
groupings are common features in fungi and are discussed
in Additional file 6: Data S1 [69-80]. In the
P. cinnabarinus BRFM137 genome, several laccase-encoding
genes were identified on the same scaffold. For instance,
the lac1 and lacS genes were separated by approximately
23 kb in the same reading frame on scaffold 185007.



Descriptions of the P. cinnabarinus ligninolytic peroxidases 
(AA2) 

We have shown that the P. cinnabarinus genome encodes
a large set of ligninolytic peroxidases of family
AA2. Nine full-length AA2 sequences were detected
from the genomic DNA of P. cinnabarinus BRFM137
(Table 3). After an initial automatic classification as LiPs
and MnPs, they were manually reclassified following the
strategy described by Ruiz-Duenas et al. [81] for manual
annotation of the complete inventory of heme peroxidases
of Pleurotus ostreatus. This protocol was based on
a combined analysis of the deduced amino acid sequences
and structural homology models obtained using
the crystal structures of related enzymes as templates.
The identified members of family AA2 share common
structural features, including four disulfide bridges and
residues coordinating two calcium ions, a proximal
histidine (acting as fifth heme iron ligand), and distal
histidine and arginine residues (involved in enzyme
activation by hydrogen peroxide), as shown in Figure 6.
The presence of specific catalytic residues [82] allowed us
to classify the nine members of family AA2 into four
types: first, three short MnPs (Figure 6A-C), characterized
both by the presence of a manganese oxidation site formed
by two glutamates and one aspartate at the internal heme
propionate region, and by a shorter C-terminal tail than
that of long and extralong MnPs [6]; second, four LiPs
(Figure 6D-G), containing a 174-Trp residue exposed to the
solvent that is responsible for oxidation of
high-redox-potential aromatic compounds; third, one VP
(Figure 6H), including both a catalytic Trp residue exposed
to the solvent and a manganese oxidation site; and fourth,
one atypical VP (Figure 6I), differing from VPs in one of
the three acidic residues of the manganese oxidation site.
A partial sequence for the first 138 amino acids of the
N-terminal end of an additional putative class II
peroxidase was also identified and could be hypothetically
annotated as a LiP6. The above set of AA2 peroxidases
identified in P. cinnabarinus is close to that identified
in Trametes versicolor (in both cases consisting of MnP,
LiP, VP and atypical VP) [84], although the total number of
sequences is lower in Pycnoporus. Two genes encoding heme
peroxidases of a recently discovered superfamily of
heme-thiolate peroxidases (HTP) [85] were also identified
in P. cinnabarinus.

These peroxidases are widely distributed in fungal
genomes, including those from soft-rot, brown-rot and
white-rot fungi [6,84,86,87]. However, only a few of them
have so far been studied, with those from Leptoxyphium
fumago and Agrocybe aegerita being the best characterized.
They are known to catalyze halogenation reactions
and to possess catalase, peroxidase and peroxygenase
activities [88]. Consequently, similar reactions are expected
to be catalyzed by the HTPs identified in the
P. cinnabarinus genome sequence.

All lip and mnp genes except MnP2 and LiP6 were also
found in the cDNA library (Additional file 3: Table S3).
The mnp genes are 1.4-1.5 kb long (Additional
file 7: Table S5) and contain 4-6 introns depending on the
gene. The two genes encoding VP and the four LiPs
showed relatively similar sizes (about 1.45 kb) and were
interrupted by six introns, for coding sequences of similar
length (about 1.1 kb).

Considering the analysis of the intron/exon structure,
a division of family AA2 into several subgroups
could be proposed. The vp and lip genes share a similar
structural organization and form one group (Additional
file 8: Figure S2 A), whereas the mnp genes are a more
heterogeneous group in terms of gene structure, i.e. exons
2 and 3 of the mnp2/mnp3 pair merge into a single exon in
mnp1, while exons 3, 4 and 5 of the mnp1/mnp2 pair
correspond to a single exon in mnp3. Finally, the
atypical-vp gene was totally different in length (1728 bp)
and in the number and structure of exons/introns compared
with the other class II peroxidase genes analyzed
(Additional file 8: Figure S2 A).

Figure 6 Molecular models for the nine class-II heme peroxidases (AA2) found in the P. cinnabarinus genome. MnP models (A-C) present
a Mn2+ oxidation site characteristic of typical MnPs, formed by two glutamates and one aspartate at the internal heme propionate region; LiP
models (D-G) exhibit a Trp residue exposed to the solvent, which has been involved in high-redox-potential aromatic compound oxidation by
typical LiPs; the VP model (H) obtained for the only peroxidase of this family identified in the genome analysis shows both the Mn2+ oxidation
site and the Trp residue exposed to the solvent, characteristic of members of this class-II family; the atypical VP (I) contains an aspartate residue
(Asp36) in a position occupied by a glutamate in VPs and MnPs. Two axial histidines, one acting as heme iron ligand (proximal histidine) and the
second (distal histidine) contributing to the heme reaction with peroxide, together with an arginine residue characterizing class-II peroxidases are
also shown in the nine molecular homology models. Four disulfide bridges are depicted as green sticks. These homology models were obtained
at the Swiss-Model protein-homology server [83] using P. eryngii VPL (PDB entries 4FCS, 2VKA and 3FJW) and P. chrysosporium LiPH2 and LiPH8
(PDB entries 1LLP, 1B80 and 1B82) crystal structures as templates.

In the genome of P. cinnabarinus, we noted that some
class II peroxidase genes were grouped on the same
scaffold, forming a cluster of peroxidases. This was the case
for the mnp3, lip1, lip2 and lip3 genes, each separated by
about 2 kb and oriented in the same transcriptional
direction on scaffold 184983. Johansson and Nyman
[89] had already described a similar cluster in
T. versicolor, comprising three genes encoding two LiPs
(LPGIII, LPGIV) and one MnP (MPGI) in a genomic region of
10 kb, oriented in the same transcriptional direction and
separated by approximately 2.4 kb. In addition, the intron/
exon organization of these T. versicolor genes pointed to
a similar structure for LPGIII and LPGIV (about
1470 bp in length, including six introns), whereas the
MPGI gene was slightly different (1400 bp interrupted
by five introns).

After analyzing the recently sequenced T. versicolor
genome [6], we identified an additional lip
gene (1441 bp in length, including six introns) 6.8 kb
upstream of the above sequences, completing the same
cluster of three lip and one mnp genes as that observed
in P. cinnabarinus. Compared with other class II
peroxidases (see the dendrogram in Figure 7), these sequences
appear closely related to those located at the same
positions in the cluster identified in P. cinnabarinus
(mnp3/mnp2, lip1/lip12, lip2/lip2 and lip3/lip1 in
P. cinnabarinus/T. versicolor). The co-localization of these
genes in both genomes suggests they may occupy a large
orthologous genomic region that has been preserved in these
two closely related species sharing a common ancestor
[84]. The htp1, htp2 and lip6 genes also clustered on
scaffold 184962, at 7.9 kb (htp1 and htp2) and 34.1 kb
(htp2 and lip6) apart (Additional file 8: Figure S2 B).
Similarly, two of the three htp genes identified in
T. versicolor, only 1.2 kb apart, form a cluster on
scaffold 12 but are arranged in the same transcriptional
direction, whereas those from P. cinnabarinus are found in
a transcriptionally convergent orientation, and the nearest
class II gene is located 64 kb away. This suggests that,
unlike what is observed for the mnp3, lip1, lip2 and lip3
genes, the organization of htp genes does not appear to be
conserved between these two species of the core polyporoid
clade. Almost all of these peroxidase genes were
transcribed, as they were
recovered in the P. cinnabarinus BRFM137 cDNA library
(Additional file 3: Table S3). The cloning of partial ///5?-like 
genes is described in Additional file 9: Data S2 [90-93]. 

Figure 7 provides a dendrogram showing sequence
relationships between 223 protein sequences of basidiomycete
class II peroxidases [6], including those identified in the
genome of P. cinnabarinus. Five peroxidase groups can
be distinguished. Cluster A consists of 39 short MnPs,
in which the three P. cinnabarinus MnPs appear closely
related to seven of the 12 short MnPs identified in the
T. versicolor genome sequence [84], and relatively
distant from the 11 VPs from P. eryngii, P. ostreatus,
P. pulmonarius, P. sapidus, B. adusta and Spongipellis sp.
also included in this cluster. A well-defined cluster B
contains all the LiP (45) sequences, including the 4 LiPs
from P. cinnabarinus closely related to the 10 LiPs identified
in T. versicolor, as well as the only P. cinnabarinus VP
grouped together with the two other VPs (from T. versicolor
and Ganoderma sp.) contained in this cluster. Cluster C
consists of 16 short MnPs, four VPs and seven atypical VPs,
plus the only atypical VP identified in P. cinnabarinus, which
is grouped together with VPs and atypical VPs from other
species (T. versicolor, D. squalens and different Ganoderma
species), all of them clustered together with P. cinnabarinus
within the core polyporoid clade. The clearly differentiated
cluster D is composed of intermixed long and extralong
MnPs absent in P. cinnabarinus and characterized by the
presence of 10-20 and 20-30 extra amino acid residues at
the C-terminal end, respectively (compared with short
MnPs), and by containing one more disulfide bridge than
LiPs, short MnPs and VPs (and their atypical variants).
Different groups of generic peroxidases (GP) and atypical
MnPs (not identified in P. cinnabarinus) are located next
to the root of the dendrogram in the cluster D.
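
According to its legend, the dendrogram in Figure 7 was computed with MEGA5 using Poisson distances and unweighted pair group method with arithmetic mean (UPGMA) clustering. The following is a minimal sketch of that general technique on toy data, not a reproduction of the MEGA5 analysis: it assumes pre-aligned, gap-free sequences of equal length, uses the Poisson correction d = -ln(1 - p) where p is the proportion of differing sites, and the four short peptides and their labels are hypothetical placeholders rather than real peroxidase sequences.

```python
import math
from itertools import combinations


def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p) between aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


def upgma(names, seqs):
    """Build a UPGMA tree (nested tuples of names) from Poisson distances."""
    # Each cluster id maps to (subtree, number of leaves in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {frozenset((i, j)): poisson_distance(seqs[i], seqs[j])
            for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            # Size-weighted average, i.e. the mean over all leaf pairs.
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[pair]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical toy alignment (labels and residues are illustrative only).
NAMES = ["mnpA", "mnpB", "lipA", "lipB"]
SEQS = ["ACDEFGHIKL", "ACDEFGHIKV", "AWDEYGHTKL", "AWDEYGHTRV"]
TREE = upgma(NAMES, SEQS)
print(TREE)
```

In this toy case the two "mnp" peptides and the two "lip" peptides pair up first and the two pairs are joined last, which mirrors how sequences of the same peroxidase type fall into the same cluster of the dendrogram.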

Descriptions of other AA proteins involved in ligninolysis 

Other putative AA proteins produce the hydrogen peroxide
necessary for the catalytic cycle of hydrogen
peroxide-dependent fungal peroxidases (LiP, MnP, VP). The
ability of hydrogen peroxide to generate hydroxyl radicals
(OH•) also points to another role of hydrogen peroxide in
the biodegradation of wood, where these hydroxyl radicals
could initiate the attack of lignocellulose [94]. For
these reasons, research into hydrogen peroxide-producing
enzymes, especially AA3_2 (aryl alcohol oxidases) and
AA5_1 (glyoxal oxidases), has surged. The subfamily
AA5_1 contains glyoxal oxidases (called Glox) and copper
radical oxidases (called Cro), which are enzymes related to
glyoxal oxidases that contain conserved active site residues
but diverge in terms of other structural features [95].
Seven AA5_1 enzymes have been identified in
P. cinnabarinus BRFM137, including three glyoxal
oxidases stricto sensu and four "radical copper oxidases"
(Additional file 10: Table S6). Furthermore, these three
glox and four cro were also expressed in the cDNA library.

Glox and Cro encoding genes (AA5_1) have diverse
characteristics in P. cinnabarinus. The genes ranged in size
from 1.85 to 4.45 kb and were interrupted by one to 22
introns, corresponding to coding sequences ranging from 1.6
to 3 kb (Additional file 11: Table S7). The structure of the
gene called cro2 stands out from the others, with a large
number (22) of introns. In contrast, the sequences
identified as glox sensu stricto share comparable size
(1.85 kb) and structure (three introns) and form a
homogeneous group. Based on the analysis of the intron/exon
structure of each Pycnoporus AA5_1 encoding gene (Additional
file 12: Figure S3 A), we could propose dividing AA5_1
into three subgroups corresponding to: (i) the glox
sequences, which had strong intron position homology,
(ii) the cro1, cro3 and cro4 sequences, and (iii) the very
different cro2 sequence. Moreover, the three glox genes
formed a cluster oriented in the same transcriptional
direction and grouped on the same scaffold, with glox2
and glox3 separated by only 1.1 kb (Additional file 12:
Figure S3 B). This type of organization has also been
found for the genes named cro3, cro4 and cro5 in
P. chrysosporium ([95]; Additional file 13: Data S3 [95-97]).
Additional file 14: Figure S4 and Additional file 15:
Data S4 report the structural comparison between the
Glox1 protein sequence from P. cinnabarinus and that of
Gaox (PDB reference 1GOG) [97].

Figure 7 Dendrogram of 223 sequences of class-II basidiomycete heme peroxidases (AA2) showing the position of nine sequences
from the P. cinnabarinus genome (orange background). Evolutionary analysis was performed with MEGA5 using Poisson distances and an
unweighted pair group method with arithmetic mean clustering. The cytochrome c peroxidase from P. ostreatus, monokaryon PC9, was used to
root the tree (http://phylobench.vital-it.ch/raxml-bb/). The dendrogram was used to illustrate the clustering of sequences (clusters A to E). Clusters
with no P. cinnabarinus sequences included were collapsed. Most of the sequences were obtained from the analysis of fungal genome
sequences deposited at the US Department of Energy Joint Genome Institute (JGI), with the rest collected from GenBank [86]. Fungal
abbreviations are as follows: BJEAD, Bjerkandera adusta (JGI); DICSQ, Dichomitus squalens (JGI); FOMME, Fomitiporia mediterranea (JGI); GANSP,
Ganoderma sp. (JGI); HETAN, Heterobasidion annosum (JGI); PHLBR, Phlebia brevispora (JGI); PYCCI, Pycnoporus cinnabarinus; STEHI, Stereum hirsutum
(JGI); TRACE, Trametopsis cervina; and TRAVE, Trametes versicolor (JGI). Other fungal species with peroxidase sequences included in the collapsed
clusters are: Agaricus bisporus (JGI), Auricularia delicata (JGI), Bjerkandera sp. (JGI), Ceriporiopsis rivulosa, Coprinellus disseminatus, Coprinopsis cinerea
(JGI), Fomitopsis pinicola (JGI), Ganoderma applanatum, Ganoderma australe, Ganoderma formosanum, Ganoderma lucidum, Gelatoporia
subvermispora (JGI), basidiomycete IZU-154, Laccaria bicolor (JGI), Lentinula edodes, Phanerochaete chrysosporium (JGI), Phanerochaete sordida,
Phlebia radiata, Pleurotus eryngii, Pleurotus ostreatus (JGI), Pleurotus pulmonarius, Pleurotus sapidus, Punctularia strigosozonata (JGI), Rhodonia
placenta (JGI), Spongipellis sp., Taiwanofungus camphoratus, Wolfiporia cocos (JGI).

Secretome analyses and lignocellulosic degradation 

Several recent studies have shown that the diversity
(number and type) of hemicellulolytic and ligninolytic
enzymes or isoenzymes produced by basidiomycetes
depends on the substrate used and the mode of cultivation
(liquid culture (LC) or solid-state fermentation (SSF))
[98-102]. Agro-residues such as fruit peels (banana,
mandarin, melon, peach and apple peels) are rich in
cellulose, hemicellulose, lignin, soluble sugars and aromatic
compounds, and were found to be substrates favoring the
production of glycoside hydrolases and laccases in white-rot
basidiomycetes [99]. Lignocellulosic residues such as
straw, bran and wood chips favor peroxidase production
by most basidiomycetes [99]. LC promotes the
production of laccases and hydrolases while SSF promotes
the production of peroxidases, including MnPs [101,102].
We thus ran several P. cinnabarinus BRFM137 cultures via
both LC and SSF in the presence of simple or complex
"natural" substrates to analyze the composition of the
corresponding secretomes (Additional file 16: Table S8).

Analysis of the P, cinnabarinus secretomes detected 
184 proteins in LC-M (maltose), 166 proteins in LC-B 
(maltose and micronized birchwood), 121 proteins in 
LC-M-MB-A (maltose, maize bran, Avicel), and 139 pro- 
teins in SSF cultures. Most of the secreted proteins in 
our culture conditions consisted of carbohydrate-active 
enzymes (CAZymes), which represented 55% and 52% of 
the total proteins detected in LC-M-MB-A and SSF, re- 
spectively, and 41% and 47% in LC-M and LC-B, respect- 
ively (Additional file 16: Table S8). CAZyme distributions 



were compared according to the different cultures condi- 
tions (Figure 8). Interestingly, the LPMOs of family AA9 
were only identified in the conditions including complex 
substrates, and no AA9 protein was found in the control 
condition with maltose. Moreover, different AA9 proteins 
were produced in response to different growth conditions. 
For instance, three AA9 proteins were produced only with 
birchwood, whereas two different AA9 proteins were 
identified in cultures with maize bran and Avicel. This result
indicates that the LPMO genes are differentially regulated
depending on the growth substrate and/or on the time scale.
Indeed, expression of the AA9-encoding genes may also be
restricted to a short window during substrate-supported
fungal growth. In recent studies, a preponderance of AA9
proteins was produced exclusively in sugar beet pulp
conditions [103]. The detailed distribution of
the (hemi)cellulolytic and ligninolytic proteins detected in 
secretomes is described in Additional file 16: Table S8. 
Interestingly, all the representatives of the ligninolytic AA
families were identified in these conditions, although with
different distribution patterns depending on growth conditions
(Additional file 17: Table S9). Three AA1_1 laccases
(scf184817_g29; scf185007_g100; scf185007_g107) were
identified in all conditions studied here, demonstrating that
these enzymes are widely and constitutively produced by the
fungus. In contrast to laccases, members of the class II
peroxidases (AA2) were only identified in the secretomes from
SSF cultures (one LiP and one MnP) and in LC-M (atypical VP).
Despite the major role of AA2 family peroxidases in lignin
degradation, no AA2 protein was detected in the conditions
using the hardwood substrate (birchwood). The class II
peroxidases could be subject to fine-tuned regulation or,
alternatively, not be produced in our growth conditions. The
expression and regulation of class II peroxidase-encoding
genes depend on environmental signals such as concentration
of carbon and nitrogen, exposure to metal ions and
xenobiotics, temperature shock, and daylight [104].

Figure 8 Venn diagram showing CAZyme distributions among
the P. cinnabarinus secretomes from different growth
conditions. LC: liquid culture, B: birchwood, M: maltose, M-MB-A:
maltose + maize bran + Avicel, SSF: solid-state fermentation.
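
The condition-by-condition comparisons described above (proteins found in every secretome versus proteins found in a single condition) amount to simple set operations. The following is a minimal sketch of that bookkeeping; the protein identifiers are hypothetical placeholders chosen for illustration and are not data from this study, and the function names are our own.

```python
# Hypothetical secretome contents keyed by culture condition.
# The protein IDs are placeholders, NOT identifiers from the study.
secretomes = {
    "LC-M": {"lac_1", "gh7_1", "vp_1"},
    "LC-B": {"lac_1", "gh7_1", "aa9_1", "aa9_2", "gh5_5_1"},
    "LC-M-MB-A": {"lac_1", "gh7_1", "aa9_3", "ce1_1"},
    "SSF": {"lac_1", "gh7_1", "lip_1", "mnp_1"},
}


def shared_by_all(groups):
    """Proteins detected in every condition (centre of the Venn diagram)."""
    return set.intersection(*groups.values())


def unique_to(groups, name):
    """Proteins detected in one condition and in no other."""
    others = set().union(*(s for k, s in groups.items() if k != name))
    return groups[name] - others


core = shared_by_all(secretomes)
birch_only = unique_to(secretomes, "LC-B")
ssf_only = unique_to(secretomes, "SSF")
```

With real data, each set would be filled from the protein lists in the supplementary tables, and `unique_to` would return, for example, the proteins detected only in the birchwood condition.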

A number of cellulolytic enzymes were produced in all 
conditions studied. For instance, secretomes contained 
members of the families GH3, GH6, GH7 and GH12, 
which are principally involved in cellulose breakdown. 
However, the endo-β-1,4-glucanases of the subfamily
GH5_5 were only produced when birchwood was used 
in the culture medium. We also identified a number of 
xylan-degrading enzymes produced only in the LC-M-MB-A
condition (maltose, maize bran, Avicel), including members
of families CE1, CE15, GH3, GH5 and GH10. Moreover,
family CE1 members were only found when maize bran
was used in the cultures. Among the known activities in
the CE1 family, feruloyl esterases are key enzymes acting
on the ferulic and diferulic acid bridges embedded in the
hemicellulose of plant cell walls [105]. Maize fiber xylan is
among the most complex heteroxylans and is highly
substituted by feruloylated branches, yielding a high ferulic
acid content of up to 3% of the dry mass [106]. Thus, the
breakdown of this substrate requires varied enzymes, as
suggested by the diversity of xylanolytic enzymes produced
by P. cinnabarinus in the presence of maize bran.

Protein secretion and glycosylation pathways 

The main lignocellulolytic enzymes of P. cinnabarinus
are extracellular, and these proteins are processed during
export by the secretion and glycosylation systems of the
fungus. Analysis of the genes involved in protein secretion
and glycosylation shows that P. cinnabarinus contains the
entire machinery needed for protein secretion via the
classical secretory pathway (Additional file 18: Table S10
and Additional file 19: Table S11, respectively). Transport of
secretory proteins is expected to take place both via a
pathway dependent on a signal recognition particle (SRP)
and via an SRP-independent pathway, as genes for both
pathways were identified. Protein transport from one
compartment to the next in the secretory pathway is carried
out by various protein complexes, such as the COPI/COPII
complexes, the Transport Protein Particle (TRAPP) complex
and the exocyst complex (Additional file 18: Table S10).

The genome contains homologs of subunits in these 
complexes and indicates that the complexes are highly 
conserved in P. cinnabarinus. We also screened for V- and
T-SNAREs (Soluble NSF Attachment Protein Receptors) in 
the genome and for secretion-related GTPases. Both the 
SNARE proteins and secretion-related GTPases are ex- 
pected to function at discrete steps in the secretory 
pathway, and for most proteins we were able to iden- 
tify a bi-directional best hit, indicating conservation of 
these functions, probably at the same step along the 
secretory pathway. 
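
The bi-directional best hit criterion used above can be illustrated with a short sketch. The text does not specify the search tool or the scoring used, so the input format here (a dictionary of pairwise similarity scores in each search direction), the function names and the example identifiers are all assumptions made purely for illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` is {(query, subject): similarity_score}; this input
    format is an assumption made for the sketch.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward, reverse):
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}


# Hypothetical scores: reference proteins searched against the genome ...
forward = {
    ("Sec61_ref", "pc_g1"): 310.0,
    ("Sec61_ref", "pc_g2"): 120.0,
    ("Ypt1_ref", "pc_g3"): 250.0,
}
# ... and genome proteins searched back against the reference set.
reverse = {
    ("pc_g1", "Sec61_ref"): 305.0,
    ("pc_g3", "Other_ref"): 260.0,
    ("pc_g3", "Ypt1_ref"): 240.0,
}

pairs = bidirectional_best_hits(forward, reverse)
```

In this toy example only the first pair is retained, because the best reverse hit of `pc_g3` is a different reference protein; such one-directional hits are the cases for which conservation of function at the same step of the pathway is less certain.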

The endoplasmic reticulum (ER) is an important organ- 
elle that harbors the enzymes required for proper folding 
of secretory proteins. P. cinnabarinus is fully equipped 
with the enzymes needed for protein folding and disulfide 
bridge formation (Additional file 18: Table S10). The
machinery to deal with misfolded or unfolded proteins (the
Unfolded Protein Response (UPR)) is also conserved,
although we were unable to identify a clear ortholog of the
Hac1/XBP1 transcription factor in the P. cinnabarinus
genome. Hac1/HacA (in fungi) or Xbp1 (in mammalian
cells) is a bZIP transcription factor that is uniquely
activated by an unconventional splicing event mediated by
Ire1p (acting as sensor and endonuclease) and Trl1p
(acting as ligase) [107]. The presence of proteins involved in
Hac1 activation, such as the sensor (Ire1p) and the tRNA
ligase (Trl1p), in the P. cinnabarinus genome suggests
that this same UPR mechanism via HacA activation is also
present in P. cinnabarinus. The removal of misfolded
proteins via the ER-associated degradation (ERAD) system,
which targets misfolded proteins for degradation in the
proteasome, is conserved, since we identified orthologs of
the ERAD and proteasome proteins (Additional file 18:
Table S10).

We also analyzed the presence of genes related to
post-translational modifications in the secretory system,
including protein N- and O-glycosylation as well as
glycosylphosphatidylinositol (GPI)-anchor biosynthesis
(Additional file 19: Table S11). The biosynthetic genes
required for the formation of the nucleotide sugars
GDP-mannose, UDP-glucose, UDP-N-acetylglucosamine and
UDP-galactose, together with the transporters that localize
the nucleotide sugars in the ER or Golgi lumen, were
identified. The genes encoding the proteins for stepwise
synthesis of the dolichol-phosphate-linked oligosaccharide
(ALG genes; asparagine (N)-linked glycosylation) as well as
for the transfer of the oligosaccharide to asparagine residues
(OST complex) are conserved and found in the genome
of P. cinnabarinus. Similarly, genes homologous to those
responsible for the attachment of the mannose residue to
serine or threonine residues (O-linked glycosylation), a step
carried out by protein mannosyl transferases (PMTs), are
also conserved. As in other fungi, the P. cinnabarinus
genome contains multiple PMT homologs.
Glycosylphosphatidylinositol (GPI)-anchor biosynthesis and
transfer of the pre-assembled GPI anchor also take place in
the ER. Most of the genes involved in GPI-anchor
biosynthesis were identified. The genes encoding
Golgi-localized proteins that are involved in outer chain
elongation (Och1p/Mnn9p mannosyltransferase complexes)
are not present in the P. cinnabarinus genome.

The genes encoding proteins that are expected to add the
second and third mannose residues to O-chains are
present, but genes homologous to the α-1,3-mannosyltransferases
that add the fourth or fifth mannose to O-chains were not
identified. Thus the post-translational glycosylation events
in the Golgi appear to be severely curtailed in P. cinnabarinus,
to much the same extent as previously reported for the
basidiomycete S. commune [61,108]. Galactofuranosylation
is a type of modification found on glycoproteins in Aspergillus
species [109], but the genes involved in this process are
absent in P. cinnabarinus. This raises prospects for using
P. cinnabarinus to produce pharmaceutical proteins, as
the glycostructures (N- and O-chains) have a mammalian-like
structure and are devoid of the highly antigenic
galactofuranose residues found in expression hosts such
as A. niger (see [109] for review).

Mating-type loci and their genes in P. cinnabarinus 

In the past, the fungal lifecycle of P. cinnabarinus was
studied in order to select monokaryotic lines with
characteristics specifically tied to lignocellulose degradation
[17]. Pycnoporus species are heterothallic Agaricomycetes
with two mating type loci controlling the fungal
lifecycle [8,110]. One mating type locus (A locus) in the
tetrapolar Agaricomycetes encodes two types of homeodomain
transcription factors (HD1 and HD2) in divergently
transcribed gene pairs, whereas the other (B locus) contains
genes for pheromone receptors and pheromone precursors
[111,112].

The A mating type locus 

HD1 and HD2 mating type proteins from P. chrysosporium
(ADN97192.1, ADN97171.1) were successfully used
to screen the Pycnoporus EST contigs. Pycnoporus, like
other basidiomycetes [110], has at least one HD1 and one
HD2 gene for homeodomain transcription factors. The HD1
protein a1-1 deduced from contig >GCTO4WP02F0TDF.f.pc.1
dna:contig contig::GCTO4WP02F0TDF.f.pc.1:1:2252:1
is 495 aa long. Its N-terminal domain is related to the
N-terminal domain of mating type proteins from other species
(Additional file 20: Figure S5 A) and is expected to act
in heterodimerization with compatible HD2 proteins
while discriminating against HD2 proteins from the same
mating type [111]. The two classes of homeodomain
proteins encoded in basidiomycete mating type loci are
defined by their distinct homeodomain sequences [113].
HD1 proteins have a TALE-class homeodomain with three
extra amino acids between Helix I and Helix II of the
three-helical DNA-binding domain. Some amino acid
exchanges in the conserved DNA-recognition motif
(WFxNxR) in Helix III are tolerated [112]. In the Pycnoporus
a1-1 protein, the position of the HD1 homeodomain
can only be recognized by sequence alignment with related
HD1 proteins from other species (Additional file 20:
Figure S5 A). The DNA-recognition sequence in Helix
III is degenerate and Helix II has undergone a deletion.
Previous research failed to find the expected conserved
HD1 motif in the respective proteins of Postia placenta
[4]. We note from other species that a defective HD1
homeodomain does not inevitably cause loss of function in
mating type regulation, provided that the HD2 homeodomain
in an HD1-HD2 heterodimer continues to function [114].
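
To illustrate why a degenerate recognition sequence can only be located by alignment, the following minimal sketch scans a protein sequence for strict matches to the WFxNxR motif, where x stands for any residue. The two example sequences are hypothetical and were invented for this illustration; they are not taken from the Pycnoporus proteins.

```python
import re

# Strict form of the conserved DNA-recognition motif WFxNxR
# ("x" = any amino acid, written as "." in the regular expression).
HD_MOTIF = re.compile(r"WF.N.R")


def find_recognition_motif(protein_seq):
    """Return (0-based position, matched text) for each strict motif hit."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in HD_MOTIF.finditer(seq)]


# Hypothetical sequences, for illustration only.
conserved = "MKTAWFQNRRAKLE"   # carries an intact WFxNxR
degenerate = "MKTAWLQSRRAKLE"  # two exchanges inside the motif

hits_conserved = find_recognition_motif(conserved)
hits_degenerate = find_recognition_motif(degenerate)
```

A strict pattern search finds the intact motif but returns nothing for the degenerate variant, which is the situation in which alignment against related HD1 proteins is needed to position the homeodomain.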

A HD2 gene for the 569 aa-long protein a2-1 was
found on >GCTO4WP02F01PN.f.pc.1 dna:contig
contig::GCTO4WP02F01PN.f.pc.1:1:2027:1. The protein has
a classical 60 amino acid-long homeodomain with all
invariant residues in the DNA-binding motif, which is highly
sequence-conserved with HD2 mating type proteins from
other Agaricomycetes (Additional file 20: Figure S5 B).

Interestingly, contig GCTO4WP02F0TDF.f.pc.1 contains
not only the full-length coding sequence of protein a1-1
(>scf185007.g8) but also, downstream on the opposite
strand, the 3′-terminal half of gene β-fg for an unknown
fungal protein (>scf185007.g7), which in most Agaricomycetes
flanks one side of the homeodomain transcription factor
locus [115]. At the other side of the locus, a mip gene for a
mitochondrial intermediate peptidase is usually present
[116]. P. chrysosporium and P. placenta differ from other
analyzed Agaricomycetes in the relative order of their
single HD1 gene to mip and β-fg. These are the two
species where the HD1 gene neighbors β-fg and not mip
and is transcribed in the same direction as mip,
suggesting that there has been an inversion of the mating
type locus [115,117]. Contig GCTO4WP02F0TDF.f.pc.1
indicates that Pycnoporus is another species with this
same inverted arrangement. P. chrysosporium
(Phanerochaetaceae) and P. placenta (Fomitopsidaceae), like
Pycnoporus (Coriolaceae), belong to the Polyporales, and
an inversion event early in evolution is likely [118].

The B mating type locus 

The bipolar P. chrysosporium contains five genes for
pheromone receptors, three of which cluster together in a
locus orthologous to the B mating type locus of tetrapolar
species, whereas two others belong to the still-unexplored
non-mating-type G protein-coupled transmembrane
receptors of the Agaricales [115,117]. The five P. chrysosporium
proteins were used to screen the Pycnoporus EST
contigs, and five hits were found. Three models contained
full-length (PciSTE3.2, PciSTE3.3) or nearly complete
(PciSTE3.4) ORFs for G protein-coupled transmembrane
receptors (Additional file 21: Figure S6). The other two
contained a 5′ half of a gene (PciSTE3N) and a 3′ half of a
gene (PciSTE3C), respectively, and it is possible that these
two EST contigs represent the same gene (Additional file 21:
Figure S6). Sequence analysis of the nearly complete
proteins using ClustalW for alignment
(http://www.clustal.org/clustal2/), GeneDoc
(http://www.psc.edu/biomed/genedoc/) for manual corrections,
and the neighbor-joining function in MEGA4 software [119]
indicates that PciSTE3.4 groups with the two non-mating-type
G protein-coupled transmembrane receptors of P. chrysosporium
(Additional file 22: Figure S7 A), whereas PciSTE3.2
and PciSTE3.3 cluster with the B-mating-type-orthologous
receptors, respectively (Additional file 22: Figure S7 B,C).
This finding suggests that P. cinnabarinus, like other
tetrapolar Agaricales, has B-mating-type-specific and
non-mating-type genes for pheromone receptors [112,115].
We also analyzed the N-terminal ends and the C-terminal
ends of the proteins separately and together with the
protein halves deduced from the incomplete EST contigs
GCTO4WP02FNFO2.f.pc.1 and GCTO4WP02F7KNS.f.pc.1,
respectively. In both phylogenetic trees, the partial
pheromone receptors group with PciSTE3.2 and
with the B-orthologous PchSTE3.2 of P. chrysosporium,
which is evidence that the sequences may come
from the same gene. As in several other species
[112,115], there are thus at least three expressed
paralogous candidate genes for B-mating-type function in
P. cinnabarinus.

Pheromone precursors are short peptide chains of up to 
about 100 aa and the mature pheromones are 9 to 14 aa- 
long peptides, which are difficult to find in BLAST searches 
even at lowest stringency due to strongly divergent se- 
quences [112,120]. Searches starting with the five identified 
P. chrysosporium pheromone precursor sequences [112,117]
were unsuccessful, but sequences from Serpula lacrymans
(http://genome.jgi-psf.org/SerlaS7_3_2/SerlaS7_3_2.home.html)
and cross-searches with the detected P. cinnabarinus
pheromone precursors identified a total of seven potential 
39-to-65-aa-long pheromone precursors. All possess the 
typical CAAX (cysteine-aliphatic-aliphatic-any amino 
acid) motif at the C-terminus and a MDA/DF-motif at 
the N-terminus (Additional file 23: Figure S8). Three are
very distinct in sequence, as is typical for B-mating-type 
pheromone precursors, whereas four others share more 
similarity, resembling the precursors of presumed non- 
mating-type pheromone-like peptides [112,115]. 
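
Because BLAST performs poorly on such short, divergent peptides, candidate precursors are typically recognized by their motifs and length instead. The following is a minimal sketch of a C-terminal CAAX filter. The set of residues counted as aliphatic, the 100 aa length cut-off (taken from the approximate upper bound stated above) and the example peptides are assumptions made for illustration; the N-terminal motif is not encoded here.

```python
import re

# Residues treated as "aliphatic" in this sketch. This set is an
# assumption; other definitions include further residues.
ALIPHATIC = "AILV"

# C, two aliphatic residues, then any residue, anchored at the C-terminus.
CAAX = re.compile(rf"C[{ALIPHATIC}]{{2}}[A-Z]$")


def is_candidate_precursor(seq, max_len=100):
    """True if the peptide is short enough and ends in a CAAX motif."""
    seq = seq.upper().rstrip("*")  # drop a trailing stop-codon symbol
    return len(seq) <= max_len and CAAX.search(seq) is not None


# Hypothetical peptides, invented for illustration only.
with_caax = "MDAFTTLSSEGSGNRPTDCVIA"
internal_only = "MDAFTTLSCVIAK"          # motif present but not C-terminal
too_long = "M" + "A" * 120 + "CVIA"      # exceeds the length cut-off
```

Applied to translated ORFs, such a filter yields a candidate list that still needs manual inspection, since a four-residue C-terminal pattern alone will also match unrelated short peptides.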



Conclusions 

The P. cinnabarinus genome contains the genes encoding
the full enzymatic portfolio for lignin degradation,
notably peroxidases and numerous auxiliary enzymes for
the generation of hydrogen peroxide. Several
laccase-encoding gene models and AA2 peroxidases (MnP, LiP
or VP) were identified in P. cinnabarinus. A large number
of these genes are expressed, as they have been detected
as transcripts in the cDNA library. Furthermore,
secretome analysis showed effective and differential
secretion of several peroxidases, copper radical oxidases,
and aryl alcohol, glucose and pyranose oxidases under
our culture conditions. These genes are organized into a
set of clusters and share homologous intron/exon
positions. The physical proximity of these genes (lac, lip,
mnp, glox) suggests that this organization may result
from chromosomal rearrangements such as local
duplications. Furthermore, the different isoenzymes annotated
in P. cinnabarinus show high diversity in terms of primary
sequence and predicted biochemical characteristics.
In P. cinnabarinus, post-translational glycosylation
capabilities appear reduced to the strict minimum, making it a
promising candidate for heterologous protein production
in biotechnological applications. In conclusion,
P. cinnabarinus is shown to be an outstanding and
representative model white-rot fungus for studying the enzyme
machinery involved in the degradation and/or transformation
of lignocellulosic materials.